1 Import the data and set up

knitr::opts_chunk$set(echo = TRUE)
library(ggplot2)
library(sf)
## Linking to GEOS 3.10.2, GDAL 3.4.2, PROJ 8.2.1; sf_use_s2() is TRUE
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ tibble  3.1.6      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.0      ✔ stringr 1.4.1 
## ✔ readr   2.1.2      ✔ forcats 0.5.1 
## ✔ purrr   0.3.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(ggspatial)
library(viridis)
## Loading required package: viridisLite
library(tmap)
boulder <- st_read("/Users/lintianshu/Desktop/648Tem/Lab1/BoulderSocialMedia.shp")
## Reading layer `BoulderSocialMedia' from data source 
##   `/Users/lintianshu/Desktop/648Tem/Lab1/BoulderSocialMedia.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 55519 features and 12 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -788775 ymin: 1917813 xmax: -780555 ymax: 1930053
## Projected CRS: NAD_1983_Albers
mi <- st_read("/Users/lintianshu/Desktop/648Tem/Lab1/2010_Census_Tracts_(v17a)/2010_Census_Tracts_(v17a).shp")
## Reading layer `2010_Census_Tracts_(v17a)' from data source 
##   `/Users/lintianshu/Desktop/648Tem/Lab1/2010_Census_Tracts_(v17a)/2010_Census_Tracts_(v17a).shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 2773 features and 14 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -90.41829 ymin: 41.69613 xmax: -82.41348 ymax: 48.26269
## Geodetic CRS:  WGS 84

2 Figures and explanations

2.1 Figure 1

boulder %>%
  filter(DB ==  'Pano' | DB == 'Flickr') %>%
  ggplot(aes(x=DB, y=TrailH_Dis)) + 
  geom_boxplot()

Figure 1 explanation: This figure shows the distance between social media photographs and the nearest hiking trail. Following the method from the lecture, I use a box plot to compare the distances for the two platforms. Note that the center line of a box plot marks the median, not the mean. The difference is more pronounced than in the figure shown in lecture: the median distance for Pano is higher than for Flickr, and the Flickr data contain more outliers. Several factors could explain this difference, such as how each social media platform is set up or different user preferences.

2.2 Figure 2
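The comparison read off the box plot can also be checked numerically. The sketch below is one possible way to do that, not part of the original lab: `summarise_trail_distance()` is a hypothetical helper, and the `toy` table holds made-up values purely so the example runs on its own.

```r
library(dplyr)

# Hypothetical helper (not from the lab): per-platform count, median, mean,
# and interquartile range of the distance to the nearest hiking trail.
summarise_trail_distance <- function(data) {
  data %>%
    filter(DB %in% c("Pano", "Flickr")) %>%
    group_by(DB) %>%
    summarise(
      n = n(),
      median_dist = median(TrailH_Dis, na.rm = TRUE),
      mean_dist = mean(TrailH_Dis, na.rm = TRUE),
      iqr_dist = IQR(TrailH_Dis, na.rm = TRUE)
    )
}

# Toy values for illustration only; these are NOT the Boulder measurements.
toy <- tibble(
  DB = c("Pano", "Pano", "Pano", "Flickr", "Flickr", "Flickr", "Other"),
  TrailH_Dis = c(10, 20, 60, 1, 2, 9, 100)
)
toy_summary <- summarise_trail_distance(toy)

# With the lab data, dropping the geometry first keeps the result a plain table:
# summarise_trail_distance(sf::st_drop_geometry(boulder))
```

Reporting the median next to the mean makes it clear which statistic the box plot actually displays, and the interquartile range gives a rough sense of spread for each platform.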

boulder %>%
  filter(DB ==  'Pano' | DB == 'Flickr') %>%
  ggplot(aes(x=DB, y=NatMrk_Dis)) + 
  geom_violin()

Figure 2 explanation: This figure shows the distance from social media photographs to natural landmarks. Instead of a box plot, I use a violin plot to compare the two platforms. The distances are more “stable” for Pano than for Flickr: many Flickr photos are concentrated at specific distances (for example, around 0 or 900), while Pano photos are more evenly distributed.

3 Geovisualizations and explanations

3.1 Geovisualization 1

ggplot() +
    geom_sf(data = boulder, aes(color=PT_Elev),
    fill = NA, alpha = .2) + 
    scale_colour_gradientn(colours = viridis(10))

Explanation: Geovisualization 1 shows the elevation of the photo locations in the study area. The visualization is similar to the one presented in class. The difference is that I use the viridis color scale, which is designed to stay readable for people with color vision deficiency.

3.2 Geovisualization 2

tmap_mode("view")
## tmap mode set to interactive viewing
tmap_options(check.and.fix = TRUE)
tm_shape(mi) +
    tm_polygons("ACRES", style = 'quantile', title = "Acres")
## Warning: The shape mi is invalid. See sf::st_is_valid

Explanation: Geovisualization 2 shows the area, in acres, of Michigan's 2010 census tracts. I created the interactive map using data from Michigan open data. You can use this interactive map to compare tract sizes across the state. A quick overview: rural tracts are usually larger, while urban tracts are usually smaller.
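The warning about an invalid shape refers to geometries that break the simple-features validity rules. The chunk already sets check.and.fix = TRUE, which asks tmap to attempt a repair while drawing. An alternative is to repair the layer explicitly with sf before mapping. The sketch below shows the idea on a toy polygon. The object names (`bowtie`, `fixed`, `mi_valid`) are illustrative, and whether this removes the warning for the Michigan layer is an assumption that would need to be checked.

```r
library(sf)

# Toy example only: a self-intersecting "bow-tie" ring, which is an invalid polygon.
ring <- rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0))
bowtie <- st_sfc(st_polygon(list(ring)))

valid_before <- st_is_valid(bowtie)  # FALSE: the ring crosses itself
fixed <- st_make_valid(bowtie)       # rebuild the shape as valid geometry
valid_after <- st_is_valid(fixed)

# The same idea applied to the census tract layer (untested assumption):
# mi_valid <- st_make_valid(mi)
# tm_shape(mi_valid) + tm_polygons("ACRES", style = "quantile")
```

Repairing the layer once, up front, also means that any later sf operations on the tracts work with the same corrected geometries that the map shows.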